# Speech Recognition
Ultravox v0.5 Llama 3.2 1B GGUF
MIT
Ultravox v0.5 is an audio-to-text model built on the Llama-3.2-1B architecture and aimed at efficient speech transcription.
Speech Recognition
ggml-org
421
1
Hubert Base Librispeech Demo Colab
Apache-2.0
A speech recognition model fine-tuned from facebook/hubert-large-ls960-ft, trained on the LibriSpeech dataset
Speech Recognition
Transformers

vishwasgautam
101
0
Wav2vec2 Base Librispeech Demo Colab
Apache-2.0
This model is a speech recognition model fine-tuned on the LibriSpeech dataset based on facebook/wav2vec2-base, achieving a word error rate of 0.3174 on the evaluation set.
Speech Recognition
Transformers

vishwasgautam
14
0
Wav2vec Checkpoints
Apache-2.0
A fine-tuned speech processing model based on facebook/wav2vec2-base, achieving 99.48% accuracy on the evaluation set
Speech Recognition
Transformers

Zeyadd-Mostaffa
19
0
Deepfake Audio Detection
Apache-2.0
A speech processing model further fine-tuned from wav2vec2-base-finetuned, achieving 98.82% accuracy on the evaluation set
Speech Recognition
Transformers

motheecreator
1,468
7
Deepfake Audio Detection
Apache-2.0
A fine-tuned speech processing model based on wav2vec2-base-finetuned, achieving 98.82% accuracy on the evaluation set
Speech Recognition
Transformers

mo-thecreator
801
7
Wav2vec2 Phoneme
Apache-2.0
A speech recognition model fine-tuned based on facebook/wav2vec2-large-xlsr-53, focusing on phoneme recognition tasks
Speech Recognition
Transformers

Bluecast
189
3
Wav2vec2 Base Finetuned
Apache-2.0
A speech processing model fine-tuned based on the facebook/wav2vec2-base model, achieving 99.97% accuracy on the evaluation set
Speech Recognition
Transformers

motheecreator
105
4
Wav2vec2 Base Finetuned
Apache-2.0
A speech processing model fine-tuned based on the facebook/wav2vec2-base model, achieving 99.97% accuracy on the evaluation set
Speech Recognition
Transformers

mo-thecreator
19
4
Wav2vec2 Base Finetuned Ks
Apache-2.0
An audio classification model fine-tuned from wav2vec2-base on an audio folder dataset, achieving 99.82% accuracy on the validation set
Audio Classification
Transformers

motheecreator
54
3
Whisper Small Dialect Classifier Cross
Apache-2.0
This model is a dialect classifier based on the whisper-small architecture, designed to recognize and classify speech inputs of specific dialects.
Audio Classification
Transformers

yaygomii
53
1
Bsc Ai Thesis Torgo Model 1
Apache-2.0
A speech processing model fine-tuned based on facebook/wav2vec2-base, demonstrating excellent performance on the evaluation set
Speech Recognition
Transformers

Juardo
19
0
Neunit Ks Kangyuan0601
Apache-2.0
This model is an audio classification model fine-tuned from facebook/wav2vec2-base on the SUPERB dataset, achieving 99.87% accuracy on the evaluation set.
Audio Classification
Transformers

SHENMU007
16
0
Wav2vec2 Base Finetuned Amd
Apache-2.0
This model is a fine-tuned version of facebook/wav2vec2-base on an unknown dataset, primarily used for speech recognition tasks, achieving an accuracy of 84.55% on the evaluation set.
Speech Recognition
Transformers

justin1983
14
0
Audio Class Finetuned
Apache-2.0
This model is an audio classification model fine-tuned from facebook/wav2vec2-base on the SUPERB dataset, achieving an accuracy of 0.6578 on the evaluation set.
Audio Classification
Transformers

Chemsseddine
20
0
Wav2vec2 Base Finetuned Ks
Apache-2.0
A speech recognition model fine-tuned on the superb dataset based on facebook/wav2vec2-base, achieving 98.34% accuracy
Speech Recognition
Transformers

marcatanante1
13
0
Whisper Small ISSAI KSC 335RS V2
A small speech recognition model based on the Whisper architecture, suitable for domain-specific speech-to-text tasks
Speech Recognition
Transformers

Shirali
83
1
Englishmodel
Apache-2.0
This model is a fine-tuned speech recognition model based on facebook/wav2vec2-xls-r-300m, primarily used for English speech-to-text tasks.
Speech Recognition
Transformers

Foxasdf
24
1
Wav2vec2 Base Finetuned Ks
Apache-2.0
This model is a speech recognition model fine-tuned on the SUPERB dataset based on facebook/wav2vec2-base, demonstrating excellent performance in keyword spotting tasks.
Speech Recognition
Transformers

teoha
14
0
Wav2vec2 Base Finetuned Ie
Apache-2.0
A fine-tuned version of the facebook/wav2vec2-base model for specific tasks
Speech Recognition
Transformers

minoosh
14
0
Wav2vec2 Base Finetuned Ks
Apache-2.0
A speech recognition model fine-tuned based on facebook/wav2vec2-base, achieving an accuracy of 87.27% on the evaluation set.
Speech Recognition
Transformers

FerhatDk
38
0
Wav2vec2 Base Timit Demo Google Colab
Apache-2.0
This model is a fine-tuned version of facebook/wav2vec2-base, primarily used for speech recognition tasks.
Speech Recognition
Transformers

ones
108
0
Wav2vec2 Base Timit Demo Google Colab
Apache-2.0
A speech recognition model fine-tuned on the TIMIT dataset based on facebook/wav2vec2-base
Speech Recognition
Transformers

Nancyzzz
103
0
Wav2vec2 Base Timit Demo Colab
Apache-2.0
A speech recognition model fine-tuned on the TIMIT dataset based on the facebook/wav2vec2-base model, featuring a low Word Error Rate (WER).
Speech Recognition
Transformers

nawta
96
1
Wav2vec2 Base Timit Demo Google Colab
Apache-2.0
A speech recognition model fine-tuned on the TIMIT dataset based on facebook/wav2vec2-base, specializing in English speech-to-text tasks
Speech Recognition
Transformers

dasolj
127
0
Wav2vec2 Base Timit Demo Google Colab
Apache-2.0
This model is a speech recognition model fine-tuned on the TIMIT dataset based on facebook/wav2vec2-base, primarily used for English speech-to-text tasks.
Speech Recognition
Transformers

neweasterns
100
0
Wav2vec2 Base Ft Cv3 V3
Apache-2.0
This model is a speech recognition model fine-tuned from facebook/wav2vec2-base on the Common Voice 3.0 English dataset, achieving a word error rate of 0.247 on the test set.
Speech Recognition
Transformers

danieleV9H
120
0
Wav2vec Trained
Apache-2.0
This model is a fine-tuned speech recognition model based on facebook/wav2vec2-base, achieving a word error rate of 0.1042 on the evaluation set.
Speech Recognition
Transformers

eugenetanjc
70
0
Wav2vec Cv
Apache-2.0
A speech recognition model fine-tuned based on facebook/wav2vec2-base-960h
Speech Recognition
Transformers

eugenetanjc
69
0
Wav2vec Mle
Apache-2.0
A speech recognition model fine-tuned based on facebook/wav2vec2-base-960h, achieving a word error rate of 1.0 on the evaluation set
Speech Recognition
Transformers

eugenetanjc
68
0
Project NLP
Apache-2.0
A speech recognition model fine-tuned based on facebook/wav2vec2-base, achieving a word error rate (WER) of 0.3355 on the evaluation set.
Speech Recognition
Transformers

zakria
22
0
Wav2vec2 Base Dataset Asr Demo Colab
Apache-2.0
This is a speech recognition model fine-tuned on the superb dataset based on distilhubert, primarily used for Automatic Speech Recognition (ASR) tasks.
Speech Recognition
Transformers

aminnaghavi
34
0
Test Demo Colab
This is an automatically generated test model, primarily for demonstration and experimental purposes.
Large Language Model
Transformers

YYSH
16
0
Wav2vec2 Base Timit Demo Google Colab
Apache-2.0
This model is a speech recognition model fine-tuned on the TIMIT dataset based on facebook/wav2vec2-base, achieving a word error rate (WER) of 0.3384 on the evaluation set.
Speech Recognition
Transformers

mikeluck
38
0
Wav2vec2 Keyword Spotting Int8
A speech keyword detection model based on the wav2vec2 architecture, optimized with Optimum OpenVINO quantization
Speech Recognition
Transformers

sampras343
17
0
Wac2vec Lllfantomlll
Apache-2.0
A speech recognition model fine-tuned based on facebook/wav2vec2-base, achieving a word error rate of 0.3417 on the evaluation set.
Speech Recognition
Transformers

lllFaNToMlll
27
0
Wav2vec2 Base Vios Commonvoice 1
Apache-2.0
This model is a speech recognition model fine-tuned on the Common Voice dataset based on facebook/wav2vec2-xls-r-300m, supporting automatic speech recognition tasks.
Speech Recognition
Transformers

tclong
21
0
Wav2vec2 Base Timit Demo Colab53
Apache-2.0
A speech recognition model fine-tuned based on facebook/wav2vec2-base, suitable for the TIMIT dataset
Speech Recognition
Transformers

Mudassar
22
0
Wav2vec2 Final 1 Lm 4
Apache-2.0
A speech recognition model fine-tuned based on facebook/wav2vec2-base, achieving a word error rate of 0.4499 on the evaluation set
Speech Recognition
Transformers

chrisvinsen
16
0
Wav2vec2 Final 1 Lm 3
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate of 0.4499 on the evaluation set, reduced to 0.126 when decoding with a 4-gram language model
Speech Recognition
Transformers

chrisvinsen
16
0